(57) Abstract

In digital video synchronisation, the trend in delay is monitored to enable a prediction to be made of the dropping or repeating of
a video field. The audio time compression or expansion is then initiated a selected time interval in advance of the field dropping or
repeating. The loss of audio synchronisation can in this way be kept below perceptible thresholds.
IMPROVEMENT IN AND RELATING TO AUDIO-VIDEO DELAY

This invention relates to the processing of audio and video signals. 
A common problem in the broadcast environment is the difference in
delay experienced by the audio and video processes. With many new digital
video processes in the signal chain and the consequential need for
re-synchronisation, the delay of the video may often differ significantly from
that of the audio. This causes the well known lip-sync error problem.

A video synchroniser operates by re-timing and, where necessary, 
either dropping or inserting fields. Each video synchroniser may therefore
add up to 40ms of video delay in a 50 field per second system such as PAL 
or 34ms in NTSC or other 60Hz systems. The precise delay will depend on 
often arbitrary system clocks. In new installations the audio is often 
"embedded" in the video channel and so when passing through the 
synchroniser it will experience the same delay as the video. 

Since a discontinuity of 40ms (or 34ms at 60Hz) is unacceptable in
audio, a temporary loss of lip sync is inevitable. In order to recover lip sync, 
the audio signal is time compressed or expanded for a period of time. This 
period of time must be sufficiently long that the resultant pitch change or other 
degradation remains imperceptible and will amount to several seconds. Over 
these several seconds, the loss of lip sync can be seriously objectionable. 

It can be shown that a loss of synchronisation in which the audio 
arrives early is particularly noticeable. It has been suggested that in the case
of the audio being earlier than the video a delay much above 10ms is
perceptible. In the opposite sense, with the audio being delayed, a delay of 
up to 30ms may be tolerated. 
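
The asymmetry described above can be expressed as a simple check. This is only a sketch: the two limits are the suggested figures quoted in the text, treated here as hard thresholds, and the function name and sign convention (positive means the audio is delayed) are assumptions made for illustration.

```python
# Suggested perception limits quoted in the text, treated here as hard
# thresholds (an assumption; the text presents them as suggestions).
EARLY_LIMIT_MS = 10.0  # audio arriving before the video
LATE_LIMIT_MS = 30.0   # audio arriving after the video


def is_perceptible(audio_offset_ms: float) -> bool:
    """Return True if the lip-sync error is assumed to be noticeable.

    Sign convention (hypothetical): a positive offset means the audio is
    delayed with respect to the video, a negative offset means it is early.
    """
    if audio_offset_ms >= 0:
        return audio_offset_ms > LATE_LIMIT_MS
    return -audio_offset_ms > EARLY_LIMIT_MS
```

Under these assumptions a 30ms audio delay and a 10ms audio advance both pass unnoticed, whereas a 40ms error in either direction does not.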

It is an object of the present invention to provide an improved method
of managing differential delay of digital audio and video signals.

Accordingly, the present invention consists in one aspect in a digital 
video synchronisation process in which digital video and audio signals are 
delayed by the same varying amount to ensure synchronisation with a timing 
reference and, on dropping or repeating of a video field, the audio signal is 
time compressed or expanded to recover audio/video synchronisation over a 
time period governed by the maximum acceptable pitch change or other
degradation, characterised in that the trend in delay is monitored to enable
prediction of the dropping or repeating of a video field and in that said time
compression or expansion is initiated a selected time interval in advance
thereof. 

In one form of the invention, this time interval is selected to minimise
the perceived loss of audio/video synchronisation. That is to say, regard is
had to the asymmetry in the thresholds for perceived loss of synchronisation
for advanced and retarded audio.

The invention will now be described by way of example with reference
to the accompanying drawing which is a block diagram of a digital video
synchroniser according to the invention.

The video synchroniser shown in the drawing receives a digital video
input signal with embedded audio and provides as an output, synchronised
with an externally generated timing reference, a digital video signal, again
with embedded audio.

Digital video first passes into block 10; this extracts the audio data and
the timing signals. Video passes to the main memory 12 to be delayed until it
is co-timed with the output timing signal extracted by block 14 from the
reference input. Block 16 measures the delay between the input and output
timing signals and passes this figure to a microprocessor 18. The calculated
delay is passed to the audio data memory 20. The audio data and video data
memories now give an identical delay. A re-sampling digital filter 22 alters the
audio data sampling rate to match the outgoing video in order that the audio
data can be synchronously inserted by block 24 into the outgoing video data
stream.
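
The delay path just described might be modelled as follows. This is a minimal sketch, not the circuit of the drawing: the function names are hypothetical, times are in milliseconds, and it is assumed that the measured delay is wrapped into a single 40ms frame period (the 50Hz case).

```python
FRAME_PERIOD_MS = 40.0  # one frame in a 50 field per second system


def measured_delay_ms(input_sync_ms: float, output_sync_ms: float,
                      frame_period_ms: float = FRAME_PERIOD_MS) -> float:
    """Delay between input and output timing signals (role of block 16).

    Assumption: the delay is wrapped into one frame period, so the video
    memory never has to hold more than one frame of delay.
    """
    return (output_sync_ms - input_sync_ms) % frame_period_ms


def memory_delays_ms(input_sync_ms: float, output_sync_ms: float) -> dict:
    """Apply the same calculated delay to the video and audio memories."""
    delay = measured_delay_ms(input_sync_ms, output_sync_ms)
    return {"video_delay_ms": delay, "audio_delay_ms": delay}
```

Because both memories are given the same figure, the embedded audio stays co-timed with the video until a frame is dropped or repeated.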

A synchroniser such as this, when its input and output timing
references are of a different frequency, as is usual, will occasionally drop or
repeat a video frame. In effect, it gains or loses 40ms (at 50Hz). Whereas the
resulting picture disturbance is often imperceptible, the same is not true for
audio - it is not possible to cut or add 40ms of audio without major
disturbance.

A typical implementation will compensate by re-sampling the audio to a
higher or lower frequency. In the case where the video memory has lost a
frame, the microprocessor 18 will initiate a process of reading extra audio
samples from the audio memory 20. A pitch change results, but if this is kept
to less than 1%, it is unlikely to be noticed. Thus, after a synchroniser drops
or repeats a frame the audio will initially be adrift by a noticeable 40ms. After
8 seconds (assuming 0.5% pitch change) the audio and video will once again
be co-timed. As a result, there will be a perceptible error for up to 6 seconds.
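
The 8 second and 6 second figures follow from simple arithmetic, sketched below. The function names are hypothetical, and the 10ms limit used for the perceptible portion is the early-audio threshold suggested earlier, taken here as an assumption.

```python
def recovery_time_s(jump_ms: float, pitch_change: float) -> float:
    """Seconds needed to absorb a timing jump at a given fractional pitch change.

    A pitch change of 0.005 (0.5%) recovers 5ms of offset per second.
    """
    return (jump_ms / 1000.0) / pitch_change


def perceptible_time_s(jump_ms: float, pitch_change: float,
                       threshold_ms: float) -> float:
    """Seconds for which the remaining offset exceeds the perception threshold."""
    excess_ms = max(jump_ms - threshold_ms, 0.0)
    return (excess_ms / 1000.0) / pitch_change


total = recovery_time_s(40.0, 0.005)             # about 8 seconds
audible = perceptible_time_s(40.0, 0.005, 10.0)  # about 6 seconds
```

With a 40ms jump and a 0.5% pitch change the offset shrinks by 5ms each second, so it takes about 6 seconds to fall to 10ms and about 8 seconds to vanish.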

In the improved system according to this invention, the microprocessor
18 will continually monitor the video delay and calculate the rate of change of
delay over time. With the highly stable timing references that are generally in
use, it will usually be possible to predict accurately the time that the video
memory will drop or gain a frame. Where it is predicted that, at the current
rate of change, a video frame is due to be lost in 6 seconds time, the
processor will initiate an increase in the audio delay to give a 0.5% pitch
change. Just before the frame loss, the audio will be 30ms (at 50Hz) delayed
with respect to the video. If the above discussed thresholds for perception of
an audio delay are correct, this loss in synchronisation is imperceptible.
Immediately after the frame loss the audio will be 10ms early. If the above-
discussed thresholds for perception of an audio advance are correct, this loss
in synchronisation is similarly imperceptible. After a further 2 seconds, the
audio and video will be co-timed. Of course, the pitch change and the
balance between worst case advance and delay may be controlled by the
user. Use of the invention has in this example and on the assumed
perception thresholds, replaced a synchronisation error which is perceptible
for up to 6 seconds, by a synchronisation error which is not perceptible at all.
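
One way the trend monitoring and the resulting offsets might be computed is sketched below. It is illustrative only: the least-squares fit, the function names and the assumption that a frame event occurs when the measured delay reaches 0 or one frame period are not taken from the text, which says only that the rate of change of delay is calculated.

```python
from typing import List, Optional, Tuple

FRAME_PERIOD_MS = 40.0  # one frame at 50Hz, as in the example above


def delay_trend_ms_per_s(samples: List[Tuple[float, float]]) -> float:
    """Least-squares slope of measured delay (ms) against time (s)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two (time_s, delay_ms) samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_d = sum(d for _, d in samples) / n
    num = sum((t - mean_t) * (d - mean_d) for t, d in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one instant")
    return num / den


def time_to_frame_event_s(samples: List[Tuple[float, float]],
                          frame_period_ms: float = FRAME_PERIOD_MS
                          ) -> Optional[float]:
    """Seconds from the last sample until the delay is predicted to wrap.

    Assumption: a frame is dropped or repeated when the delay reaches
    0 or one frame period. Returns None when no drift is detected.
    """
    slope = delay_trend_ms_per_s(samples)
    if slope == 0:
        return None
    _, last_delay = samples[-1]
    remaining = frame_period_ms - last_delay if slope > 0 else last_delay
    return remaining / abs(slope)


def offsets_around_frame_loss_ms(lead_time_s: float,
                                 pitch_change: float = 0.005,
                                 jump_ms: float = FRAME_PERIOD_MS
                                 ) -> Tuple[float, float]:
    """Audio offset just before and just after a predicted frame loss.

    Positive values mean the audio is delayed, negative that it is early.
    """
    before = lead_time_s * pitch_change * 1000.0
    after = before - jump_ms
    return before, after
```

Starting the 0.5% pitch change 6 seconds ahead gives roughly 30ms of audio delay before the frame loss and roughly 10ms of audio advance after it, matching the figures in the paragraph above.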

When it is predicted that, at the current rate of change, a video frame
is due to be repeated in 2 seconds time the processor will initiate a decrease
in the audio delay to give a 0.5% pitch change. Once again the error will be
imperceptible.
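
The 6 second and 2 second lead times used in the two cases can be derived from the assumed perception thresholds, as the following sketch suggests. The function names, the event labels "drop" and "repeat" and the sign convention (positive means the audio is delayed) are hypothetical, and the post-repeat offset of 30ms is inferred by symmetry rather than stated in the text.

```python
from typing import Tuple


def lead_time_s(event: str, pitch_change: float = 0.005,
                late_limit_ms: float = 30.0,
                early_limit_ms: float = 10.0) -> float:
    """Lead time that brings the pre-event offset up to the tolerated limit.

    Before a frame loss ("drop") the audio is delayed, so the late limit
    applies; before a frame repeat the audio is advanced, so the early
    limit applies.
    """
    if event == "drop":
        pre_event_limit_ms = late_limit_ms
    elif event == "repeat":
        pre_event_limit_ms = early_limit_ms
    else:
        raise ValueError("event must be 'drop' or 'repeat'")
    return pre_event_limit_ms / (pitch_change * 1000.0)


def offsets_ms(event: str, pitch_change: float = 0.005,
               jump_ms: float = 40.0) -> Tuple[float, float]:
    """Signed audio offset just before and just after the event."""
    ramp_ms = lead_time_s(event, pitch_change) * pitch_change * 1000.0
    if event == "drop":
        return ramp_ms, ramp_ms - jump_ms
    return -ramp_ms, jump_ms - ramp_ms
```

On these assumptions a frame loss is anticipated by about 6 seconds and a frame repeat by about 2 seconds, and in both cases the worst offsets are about 30ms late and 10ms early.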

It will be understood that the invention has been described by way of
examples only. Thus, rate conversion is only one example of a technique for
time compression or expansion of the audio signal. Numerous alternative 
techniques, such as silence compression, will be known to the skilled reader 
and fall within the scope of the claimed invention. 
1. A digital video synchronisation process in which digital video and audio
signals are delayed by the same varying amount to ensure
synchronisation with a timing reference and, on dropping or repeating
of a video field, the audio signal is time compressed or expanded to 
recover audio/video synchronisation over a time period governed by 
the maximum acceptable pitch change or other degradation, 
characterised in that the trend in delay is monitored to enable 
prediction of the dropping or repeating of a video field and in that said 
time compression or expansion is initiated a selected time interval in 
advance thereof. 

2. A process according to Claim 1, wherein said time interval is selected
to minimise the absolute loss of audio/video synchronisation. 

3. A process according to Claim 1, wherein said time interval is selected
to minimise the perceived loss of audio/video synchronisation. 

4. A process according to Claim 3, wherein said time interval is selected
such that the period for which the audio is delayed with respect to the
video is greater than the period for which the audio is
advanced with respect to the video. 